The research conducted on the contract was concerned with image understanding
techniques applicable to autonomous vehicle navigation. Thirty-six technical
reports were issued on the project. Abstracts of these reports are given on the
following pages; numbers in brackets in the list below refer to these abstracts.

1) Time-varying image analysis and optical flow [2, 6, 8, 14, 17, 21, 22, 23, 24,
26, 30, 34].

2) Stereo [18, 19, 28, 33], including dynamic stereo and binocular optical flow [9,
15]; see also [27] on matching.

3) Range sensing and analysis of range data [31, 32].

4) Three-dimensional object recognition and the geometry of 3-D vision [4, 20,
25, 29, 35, 36].

5) Tracking, visibility, and path planning [10, 11, 12, 16; see also 1, 3, 5].

Two other reports dealt with color edge detection [13] and with software [7].


ABSTRACTS  OF  TECHNICAL  REPORTS 

1.  Takashi  Matsuyama  and  Tsai-Yun  Phillips,  “Extracting  the  Medial  Axis  from 
the  Voronoi  Diagram  of  Boundary  Segments:  An  Alternative  Method 
for  Closed  Boundary  Detection.”  CAR-TR-2,  CS-TR-1261,  April  1983. 

ABSTRACT:  An  algorithm  to  recover  closed  boundaries  from  disconnected 
boundary  segments  is  presented.  There  is  a  close  relation  between  the  medial 
axis transform and the Voronoi diagram. Here we introduce a geometric labeling
scheme  for  the  Voronoi  diagram  of  boundary  segments,  and  recover  the  medial 
axes  of  closed  boundaries  by  using  the  labeled  Voronoi  diagram.  Although  all 
examples  given  in  this  paper  are  pictures  of  straight  line  segments  in  the  two- 
dimensional  Euclidean  plane,  the  basic  idea  is  immediately  applicable  to  digital 
pictures  with  curved  segments. 


2. Hu-Chen Xie, Kwangyoen Wohn, Larry S. Davis and Azriel Rosenfeld, “Optical
Flow Field Smoothing by Local Use of Global Information.” CAR-TR-3,
CS-TR-1274, April 1983.

ABSTRACT:  This  paper  describes  a  method  for  image  motion  enhancement 
which  utilizes  global  information  about  the  motion  field,  derived  from  the  histo¬ 
grams  of  the  components  of  the  estimated  motion.  The  method  is  an  adaptation 
of  the  “superspike”  image  enhancement  algorithm  to  motion  field  estimation. 
Experiments  indicate  that  the  method  can  yield  more  accurate  and  precise  esti¬ 
mates  of  motion  than  previously  proposed  motion  estimation  algorithms. 


3.  Tsai-Yun  Phillips  and  Takashi  Matsuyama,  “The  Labeled  Discrete  Voronoi 
Diagram.”  CAR-TR-4,  CS-TR-1278,  May  1983. 

ABSTRACT:  Generalized  Voronoi  diagrams  of  sets  of  digital  curves  are  a  helpful 
tool  in  picture  analysis.  In  this  paper,  an  algorithm  for  computing  labeled  Voro¬ 
noi  diagrams  for  digital  straight  line  segments  is  given.  Special  emphasis  was 
given  to  the  use  of  a  labeled  Euclidean  distance  transform.  This  transform  is  the 
key  feature  of  the  proposed  label  propagation  process.  The  proposed  parallel 
algorithm  for  computing  labeled  Voronoi  diagrams  has  time  complexity 
O(max{M, N}) for input pictures of size M × N using a mesh-connected array
processor.  The  proposed  serial  algorithm  for  computing  labeled  Voronoi  diagrams 
has  time  complexity  O(MN). 


4.  Teresa  M.  Silberberg,  Larry  Davis  and  David  Harwood,  “An  Iterative  Hough 
Procedure  for  Three-Dimensional  Object  Recognition.”  CAR-TR-20, 
CS-TR-1317,  August  1983. 

ABSTRACT:  This  paper  describes  an  iterative  Hough  procedure  for  recognizing 
images  of  three-dimensional  objects.  Straight  line  segments  in  the  image  are 
matched  by  finding  the  parameters  of  a  viewing  transformation  of  a  three- 
dimensional  model  consisting  of  line  segments.  Assuming  the  scale  of  the  object 
is  known,  there  are  three  orientation  and  two  translation  parameters  to  be 
estimated.  Initially  a  sparse,  regular  subset  of  parameters  and  transformations  is 
evaluated  for  goodness-of-fit;  then  the  procedure  is  repeated  by  successively  sub¬ 
dividing  the  parameter  space  near  current  best  estimates  of  peaks. 
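
The subdivision strategy described in this abstract can be illustrated with a minimal sketch. It assumes only a generic goodness-of-fit function over a box of parameters; the function name `coarse_to_fine_search`, its arguments, and the toy score in the usage note are hypothetical and are not taken from the report, which fits a three-dimensional line-segment model rather than a toy function.

```python
import itertools
import numpy as np

def coarse_to_fine_search(score, lower, upper, samples_per_axis=5, iterations=6):
    """Maximize score(params) by repeatedly sampling a sparse regular grid
    and shrinking the search box around the best sample found so far."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    best_params, best_score = None, -np.inf
    for _ in range(iterations):
        # Sparse, regular sampling of the current parameter box.
        axes = [np.linspace(lo, hi, samples_per_axis) for lo, hi in zip(lower, upper)]
        for candidate in itertools.product(*axes):
            value = score(np.array(candidate))
            if value > best_score:
                best_params, best_score = np.array(candidate), value
        # Subdivide: the next box spans one grid cell on each side of the best sample.
        cell = (upper - lower) / (samples_per_axis - 1)
        lower, upper = best_params - cell, best_params + cell
    return best_params, best_score
```

With a single-peaked toy score such as `lambda q: -np.sum((q - target) ** 2)`, each pass halves the width of the box, so a handful of iterations localizes the peak far more cheaply than one dense grid would. A score with several competing peaks would need more than one box to be tracked, which this sketch does not attempt.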


5.  Takashi  Matsuyama  and  Tsai-Yun  Phillips,  “Digital  Realization  of  the 
Labeled  Voronoi  Diagram  and  Its  Application  to  Closed  Boundary 
Detection.”  CAR-TR-22,  CS-TR-132f’,  October  1983. 

ABSTRACT:  An  algorithm  to  compute  the  labeled  Voronoi  diagram  of  a  set  of 
line  segments  in  a  digital  picture  is  presented.  The  algorithm  for  extracting  the 
medial axis from the labeled Voronoi diagram [1] is also implemented. Experimental
results have shown that both algorithms can be used to construct closed
boundaries  from  disjoint  line  segments  in  a  digital  picture. 


6. Allen M. Waxman and Kwangyoen Wohn, “Contour Evolution, Neighborhood
Deformation and Global Image Flow: Planar Surfaces in Motion.”
CAR-TR-58, CS-TR-1394, April 1984.

ABSTRACT:  In  the  kinematic  analysis  of  time-varying  imagery,  where  the  goal 
is  to  recover  object  surface  structure  and  space  motion  from  image  flow,  an 
appropriate  representation  for  the  flow  field  consists  of  a  set  of  deformation 
parameters  which  describe  the  rate-of-change  of  an  image  neighborhood.  In  this 
paper  we  develop  methods  for  extracting  these  deformation  parameters  from 
evolving  contours  in  an  image  sequence;  the  image  contours  being  manifestations 
of  surface  texture  of  the  underlying  image  flow;  no  heuristics  are  imposed.  The 
deformation  parameters  we  seek  are  actually  linear  combinations  of  the  Taylor 
series  coefficients  (through  second  derivatives)  of  the  local  image  flow  field.  Thus, 
a  by-product  of  our  approach  is  a  second-order  polynomial  approximation  to  the 
image  flow  in  the  neighborhood  of  a  contour.  For  curved  surfaces  this  approxi¬ 
mation  is  only  locally  valid,  but  for  planar  surfaces  it  is  globally  valid  (i.e.,  it  is 
exact).  Our  analysis  reveals  an  “aperture  problem  in  the  large”  in  which 
insufficient  contour  structure  leaves  the  set  of  twelve  deformation  parameters 
under-determined.  We  also  assess  the  sensitivity  of  our  method  to  the  simulated 
effects  of  noise  in  the  “normal  flow”  around  contours,  as  well  as  the  angular  field 
of view subtended by contours. The sensitivity analysis is carried out in the context
of planar surfaces executing general rigid body motions in space. Future work will
address the additional considerations relevant to curved surface patches.


7.  Fred  P.  Andresen,  “The  Franz  Lisp  —  C  Interface.”  CAR-TR-68,  CS-TR- 
1411,  June  1984. 

ABSTRACT:  The  programming  languages  Lisp  and  C  have  complementary 
powers.  Lisp  is  very  high  level  and  C  can  be  very  low  level.  For  this  reason  it 
can  be  extremely  worthwhile  to  combine  their  favorable  features.  Although 
briefly  mentioned  in  Section  8.4  of  the  Franz  Lisp  manual,  the  topic  is  not 
covered  in  detail.  This  document  elaborates  and  expands  considerably  on  that 
section  in  the  manual.  Several  examples  are  given.  A  quick  and  easy  section  for 
those  with  limited  need  is  also  included. 


8. Sarvajit S. Sinha and Allen M. Waxman, “An Image Flow Simulator.” CAR-TR-71,
CS-TR-1417, July 1984.

ABSTRACT: The analysis of time-varying images is currently of great interest
in computer vision. There has been a great deal of work recently in the study of the
2-D image flow produced by space motion, and the recovery of the object’s
3-D structure and space motion from the flow. This report details the reverse
process  implemented  in  the  form  of  an  Image  Flow  Simulator;  from  a  knowledge 
of  structure  and  motion,  to  display  the  2-D  image  sequence  and  associated  flow. 
This  3-D  graphics  animation  package  simulates  motion  of  objects  through  space 
and  also  the  evolution  of  surface  contours  through  time.  The  graphics  algorithms 
for  projection,  clipping,  hidden  surface  removal,  shading  and  animation  are 
described  in  this  report. 


9. Allen M. Waxman and Sarvajit S. Sinha, “Dynamic Stereo: Passive Ranging
to Moving Objects from Relative Image Flows.” CAR-TR-74, CS-TR-1421,
July 1984.

ABSTRACT:  A  new  concept  in  passive  ranging  to  moving  objects  is  described 
which is based on the comparison of multiple image flows. It is well known that if
a static scene is viewed by an observer undergoing a known relative translation
through space, then the distance to objects in the scene can be easily obtained
from the measured image velocities associated with features on the objects (i.e.,
motion stereo). But in general, individual objects are translating and rotating at
unknown  rates  with  respect  to  a  moving  observer  whose  own  motion  may  not  be 
accurately  monitored.  The  net  effect  is  a  complicated  image  flow  field  in  which 
absolute range information is lost. However, if a second image flow field is produced
by a camera whose motion through space differs from that of the first camera
by a known amount, the range information can be recovered by subtracting
the first image flow from the second. This “difference flow” must then be
corrected  for  the  known  relative  rotation  between  the  two  cameras,  resulting  in  a 
divergent  relative  flow  from  a  known  focus  of  expansion.  This  passive  ranging 
process  may  be  termed  Dynamic  Stereo,  the  known  difference  in  camera  motions 
playing  the  role  of  the  stereo  baseline.  We  present  the  basic  theory  of  this  rang¬ 
ing  process,  along  with  some  examples  for  simulated  scenes.  Potential  applica¬ 
tions  are  in  autonomous  vehicle  navigation  (with  one  fixed  and  one  movable  cam¬ 
era  mounted  on  the  vehicle),  coordinated  motions  between  two  vehicles  (each  car¬ 
rying  one  fixed  camera)  for  passive  ranging  to  moving  targets,  and  in  industrial 
robotics  (with  two  cameras  mounted  on  different  parts  of  a  robot  arm)  for  inter¬ 
cepting  moving  workpieces. 


10. Mark F. Doherty, “Computation of Minimal Isovist Sets.” CAR-TR-87,
CS-TR-1436, September 1984.

ABSTRACT:  A  minimal  isovist  set  (MIS)  of  a  simple  polygonal  region  P  is  a 
smallest  set  of  points  in  P  whose  union  of  isovists  equals  P  (where  the  isovist  of 
x  is  the  set  of  all  points  visible  from  x).  This  thesis  presents  an  algorithm  to 
search for an MIS for an arbitrary P. An MIS is shown to be equivalent to a
minimal  covering  of  P  with  star-shaped  polygons.  A  (non-complete)  algorithm  to 
find  a  minimal  covering  is  proposed  which  uses  the  vertices  of  the  kernels  of  the 
star-shaped  polygons.  The  complexity  of  finding  an  MIS  is  reduced  to  a  worst- 
case  consideration  of  no  more  than  nA  points  in  P.  A  comparison  of  the  pro¬ 
posed  algorithm  with  two  previously  published  algorithms  is  made.  Extension  of 
this  method  to  exterior  views  and  interior  holes  is  discussed,  and  areas  for  future 
research  are  mentioned. 


11. Nader Kazor, “Target Tracking Based Scene Analysis.” CAR-TR-88,
CS-TR-1437, August 1984.

ABSTRACT: Target Tracking and 3-D Scene Analysis are two research areas in
Computer  Vision  which  in  the  past  have  been  considered  separately.  However, 
there  are  many  advantages  in  combining  the  two  problems.  One  such  advantage 
would  be  the  ability  to  analyze  and  build  a  model  of  a  stationary 
scene/environment  through  which  dynamic  objects  move.  This  is  possible 
through  tracking  the  moving  objects  and  detecting  instances  of  occlusion.  This 
work  is  based  on  such  an  idea  and  is  concerned  with  the  design  of  an  Intelligent 
Target  Tracking  System  (ITTS)  which  combines  the  above  two  problems  into 
one.  In  this  paper  we  present  an  experimental  ITTS  which  generates  a  perspec¬ 
tive  and  ground  map  of  a  stationary  environment. 


12.  Frederick  P.  Andresen  and  Larry  S.  Davis,  “Visual  Position  Determination 
for  Autonomous  Vehicle  Navigation.”  CAR-TR-100,  CS-TR-1458, 
November  1984. 

ABSTRACT:  This  report  describes  a  system  by  which  an  autonomous  land  vehi¬ 
cle  might  improve  its  estimate  of  its  current  position.  This  system  selects  visible 
landmarks  from  a  database  of  knowledge  about  its  environment  and  controls  a 
camera’s  direction  and  focal  length  to  obtain  images  of  these  landmarks.  The 
landmarks  are  then  located  in  the  images  using  a  modified  version  of  the  general¬ 
ized  Hough  transform  and  their  locations  are  used  to  triangulate  to  obtain  the 
new  estimate  of  vehicle  position  and  position  uncertainty. 
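
The triangulation step mentioned in this abstract can be sketched as a small least-squares problem. The sketch assumes that the vehicle heading is known, so that each landmark sighting yields an absolute bearing in the world frame; the function name `triangulate_position` and its arguments are hypothetical, and the report's Hough-based landmark location and its uncertainty estimate are not reproduced.

```python
import numpy as np

def triangulate_position(landmarks, bearings):
    """Least-squares vehicle position from absolute bearings to known landmarks.

    landmarks: (N, 2) array of known landmark positions.
    bearings:  (N,) array of world-frame angles (radians) of the ray that
               points from the vehicle toward each landmark.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    bearings = np.asarray(bearings, dtype=float)
    # The vehicle lies on the line through each landmark along its bearing,
    # so the component of (landmark - position) normal to the ray is zero.
    normals = np.column_stack([-np.sin(bearings), np.cos(bearings)])
    rhs = np.sum(normals * landmarks, axis=1)
    position, *_ = np.linalg.lstsq(normals, rhs, rcond=None)
    return position
```

Two landmarks with non-parallel bearings already determine the position; additional landmarks over-determine the system, and the least-squares solution then averages out bearing errors.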


13. Matti Pietikainen and David Harwood, “Edge Information in Color Images
Based on Histograms of Differences.” CAR-TR-112, CS-TR-1479,
March 1985.

ABSTRACT:  A  new  measure  of  edge  information  for  color  images  based  on 
cumulative  histograms  of  absolute  color  differences  is  proposed.  A  multispectral 
version  of  the  Symmetric  Nearest  Neighbor  filter  for  edge-preserving  smoothing 
and  methods  for  image  segmentation  and  edge  detection  are  developed  based  on 
this  measure.  Experimental  results  show  that  the  performance  of  the  new  algo¬ 
rithms  is  very  good. 
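
For readers unfamiliar with the Symmetric Nearest Neighbor filter named in this abstract, the following is a minimal gray-level sketch of one common formulation on a 3x3 window. It is an illustration only: the multispectral version and the histogram-based edge measure proposed in the report are not reproduced, and the function name `snn_filter` is hypothetical.

```python
import numpy as np

def snn_filter(image):
    """One pass of a 3x3 Symmetric Nearest Neighbor filter on a 2-D array."""
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    # Four pairs of neighbors placed symmetrically about the center pixel.
    pairs = [((-1, -1), (1, 1)), ((-1, 0), (1, 0)),
             ((-1, 1), (1, -1)), ((0, -1), (0, 1))]
    total = np.zeros_like(img)
    for (dr1, dc1), (dr2, dc2) in pairs:
        a = padded[1 + dr1:1 + dr1 + rows, 1 + dc1:1 + dc1 + cols]
        b = padded[1 + dr2:1 + dr2 + rows, 1 + dc2:1 + dc2 + cols]
        # From each pair keep the neighbor closer in value to the center.
        total += np.where(np.abs(a - img) <= np.abs(b - img), a, b)
    return total / len(pairs)
```

Because each pair contributes only the neighbor that resembles the center pixel, a pixel next to a step edge is averaged with values from its own side of the edge, which is why the filter smooths without blurring the edge.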


14. Muralidhara Subbarao and Allen Waxman, “On the Uniqueness of Image
Flow Solutions for Planar Surfaces in Motion.” CAR-TR-114, CS-TR-1485,
April 1985.

ABSTRACT: Two important results relating to the uniqueness of image flow
solutions for planar surfaces in motion are presented here. These results relate to
the formulation of the image flow problem by Waxman and Ullman [1], which is
based  on  a  kinematic  analysis  of  the  image  flow  field.  The  first  result  concerns 
resolving  the  duality  of  interpretations  that  are  generally  associated  with  the 
instantaneous  image  flow  of  an  evolving  image  sequence.  It  is  shown  that  the 
interpretation  for  orientation  and  motion  of  planar  surfaces  is  unique  when  either 
two  successive  image  flows  of  one  planar  surface  patch  are  given  or  one  image 
flow  of  two  planar  patches  moving  as  a  rigid  body  is  given.  We  have  proved  this 
by  deriving  explicit  expressions  for  the  evolving  solution  of  an  image  flow 
sequence  with  time.  These  expressions  can  be  used  to  resolve  this  ambiguity  of 
interpretation  in  practical  problems.  The  second  result  is  the  proof  of  uniqueness 
for  the  velocity  of  approach  which  satisfies  the  image  flow  equations  for  planar 
surfaces derived in [1]. In addition, it is shown that this velocity can be computed
as the middle root of a cubic equation. These two results together suggest
a  new  method  for  solving  the  image  flow  problem  for  planar  surfaces  in  motion. 


15.  Allen  M.  Waxman  and  James  H.  Duncan,  “Binocular  Image  Flows:  Steps 
Toward  Stereo  -  Motion  Fusion.”  CAR-TR-119,  CS-TR-1494,  May 
1985. 

ABSTRACT:  The  analyses  of  visual  data  by  stereo  and  motion  modules  have 
typically  been  treated  as  separate,  parallel  processes  which  both  feed  a  common 
viewer-centered  2.5-D  sketch  of  the  scene.  When  acting  separately,  stereo  and 
motion analyses are subject to certain inherent difficulties: stereo must resolve a
combinatorial correspondence problem and is further complicated by the presence
of occluding boundaries, while motion analysis involves the solution of nonlinear
equations and yields a 3-D interpretation specified up to an undetermined scale factor.
A new module is described here which unifies stereo and motion analysis in a
manner in which each helps to overcome the other’s shortcomings. One important
result is a correlation between relative image flow (i.e., binocular difference
flow) and stereo disparity; it points to the importance of the ratio δ̇/δ, the rate of
change of disparity δ̇ to disparity δ, and its possible role in establishing stereo
correspondence.  Our  formulation  may  reflect  the  human  perception  channel 
probed  by  Regan  and  Beverley  (1979). 


16.  Subbarao  Kambhampati  and  Larry  S.  Davis,  “Multiresolution  Path  Planning 
for  Mobile  Robots.”  CAR-TR-127,  CS-TR-1507,  May  1985. 

ABSTRACT:  The  problem  of  automatic  collision-free  path  planning  is  central  to 
mobile  robot  applications.  In  this  report,  we  present  an  approach  to  automatic 
path  planning  based  on  a  quadtree  representation.  We  introduce  hierarchical 
path  searching  methods,  which  make  use  of  this  multiresolution  representation, 
to  speed  up  the  path  planning  process  considerably.  Finally,  we  discuss  the 
applicability  of  this  approach  to  mobile  robot  path  planning. 


17. Kwangyoen Wohn and Allen M. Waxman, “Contour Evolution, Neighborhood
Deformation and Local Image Flow: Curved Surfaces in
Motion.” CAR-TR-134, CS-TR-1531, July 1985.

ABSTRACT: In our earlier paper (Waxman and Wohn 1984), we developed an
algorithm,  the  Velocity  Functional  Method,  to  recover  an  image  flow  field  from 
time-varying  contours.  The  method  follows  directly  from  the  analytic  structure 
of  the  underlying  image  flow;  no  heuristics  are  imposed.  Local  image  flow  is 
modeled  as  a  second-order  Taylor  series.  The  method  computes  twelve  series 
coefficients  from  the  normal  component  of  image  flow  measured  along  contours. 
For  planar  surfaces  in  motion,  the  method  yields  the  exact  flow.  We  have 
demonstrated  the  robustness  of  our  algorithm  by  carrying  out  the  sensitivity 
analysis  in  the  context  of  planar  surfaces  executing  general  rigid  body  motions  in 
space. 


This  paper  explores  the  additional  aspects  of  the  theory  for  curved  surfaces, 
where  the  second-order  flow  approximation  is  only  locally  valid.  We  derive  the 
dependence  of  the  truncation  error  on  surface  curvature  and  field  of  view.  We 
also  investigate  the  sensitivity  of  solutions  to  noise  in  the  normal  flow.  The  com¬ 
bined  algorithms  of  2-D  flow  estimation  and  3-D  structure  and  motion  recovery 
are  not  as  stable  to  input  noise  and  surface  structure  as  is  the  case  for  planar  sur¬ 
faces.  The  use  of  multiple  frames  to  overcome  the  effects  of  noise  is  currently 
under  study  (cf.  Waxman  and  Wohn  1985). 
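
The idea of recovering twelve coefficients of a second-order flow model from normal flow alone can be sketched as a linear least-squares fit. This is a minimal illustration under stated assumptions, not the Velocity Functional Method itself: it uses a plain monomial basis rather than the report's deformation parameters, and the function names `monomials` and `fit_flow_from_normal_flow` are hypothetical.

```python
import numpy as np

def monomials(x, y):
    """Second-order monomial basis [1, x, y, x^2, xy, y^2] at each point."""
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def fit_flow_from_normal_flow(x, y, nx, ny, vn):
    """Least-squares estimate of the 12 polynomial flow coefficients.

    Each contour point (x, y) with unit normal (nx, ny) and measured normal
    flow vn contributes one linear equation
        nx * u(x, y) + ny * v(x, y) = vn,
    where u and v are second-order polynomials with 6 coefficients each.
    """
    x, y, nx, ny, vn = (np.asarray(a, dtype=float) for a in (x, y, nx, ny, vn))
    basis = monomials(x, y)
    design = np.hstack([nx[:, None] * basis, ny[:, None] * basis])
    coeffs, *_ = np.linalg.lstsq(design, vn, rcond=None)
    return coeffs[:6], coeffs[6:]
```

When the contour points and normals are varied enough, the design matrix has full rank and all twelve coefficients are recovered. With too little contour structure the matrix becomes rank deficient and the coefficients are under-determined, which appears to correspond to the “aperture problem in the large” described in report 6.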


18.  Roger  D.  Eastman  and  Allen  M.  Waxman,  “Using  Disparity  Functionals  for 
Stereo  Correspondence  and  Surface  Reconstruction.”  CAR-TR-145, 
CS-TR-1547,  October  1985. 

ABSTRACT:  In  this  paper,  we  investigate  stereo  matching  constraints  that 
derive  from  an  analytic  model  of  surface  depth.  Analyticity  is  the  mathematical 
tool  by  which  we  model  smoothness  of  object  surfaces,  and  therefore  the  disparity 
field,  as  piecewise  analytic  functions  of  visual  direction.  Our  model  of  analytic 
coherence  mathematically  formulates  the  principle  of  coherence  stated  by 
Prazdny  [23],  and  can  describe  transparent  as  well  as  opaque  surfaces.  In  using 
this property, we follow the work in stereo of Koenderink and van Doorn [12] and
our own work on motion (Waxman and Ullman [31], Waxman [29], Waxman and
Wohn [32, 34]). We formulate stereo as a single stage process in which potential
feature point or contour matches interact to provide support for local estimates
of a polynomial model of disparity (the disparity functional), not just estimates of
disparity at isolated points. This refines the notion of local support defined by
Marr and Poggio [17]. We present an algorithm that integrates the disparity
functional  with  multiresolution  matching  of  zero-crossings  to  derive  depth  to  sur¬ 
face  patches.  The  analyticity  of  the  disparity  field  is  thereby  exploited  early  in 
the  matching  process,  and  yields  surface  reconstruction  as  a  direct  byproduct  of 
correspondence. 


19. Matti Pietikainen and David Harwood, “Multiple-Camera Contour Stereo.”
CAR-TR-151, CS-TR-1559, September 1985.

ABSTRACT:  A  three-camera  approach  for  computational  stereo  is  presented, 
which  greatly  simplifies  the  search  problem  among  candidate  matches  and  allows 
matching  of  horizontal  edges.  Only  a  simple  camera  geometry  is  considered,  in 
which  the  images  are  rectified  in  the  same  plane.  The  horizontal  and  vertical 
images are equidistant from and aligned parallel to the base image. The primitive
objects of the approach are labeled edge segments, i.e., 8-connected chains of
edge points with their local image properties. The matching algorithm scans
through the edge segments in the base image and searches for corresponding triples
of points in the three images. Local properties of points are used to classify
matches. A preliminary evaluation of matches is based on goodness of match
criteria. A simple postprocessing method based on contour connectivity is used
to  eliminate  false  matches.  The  method  performs  well  in  experiments.  The  basic 
matching  algorithm  generates  only  a  few  false  matches  and  most  of  these  can  be 
easily  eliminated. 


20. Ambjorn Naeve and Jan-Olof Eklundh, “On Projective Geometry and the
Recovery of 3-D Structure.” CAR-TR-154, CS-TR-1565, October 1985.

ABSTRACT: Geometric properties are of key importance in the recovery of
scene  structure  from  images.  It  is  argued  that  the  proper  formulations  of  the 
determination  of  scene  geometry  are  obtained  when  projective  geometry  is  used. 
A  framework  of  projective  geometry  for  computer  vision  is  presented  in  brief  and 
its  applicability  is  demonstrated  in  a  simple  example.  A  computational  approach 
to  finding  the  necessary  primitives  is  reviewed. 


21. Ken-ichi Kanatani, “Analysis of Structure and Motion from Optical Flow.
Part I: Orthographic Projection.” CAR-TR-160, CS-TR-1576, October
1984. Revised June 1985.

ABSTRACT: The 3D structure and motion of an object is determined from its
optical  flow  under  orthographic  projection.  First,  the  image  domain  is  divided 
into  planar  or  almost  planar  regions  by  checking  the  flow.  For  each  region, 
parameters  of  the  flow  are  determined.  Transformation  rules  under  coordinate 
changes  and  hydrodynamic  analogies  are  also  discussed.  The  3D  structure  and 
motion  are  determined  in  explicit  forms  in  terms  of  irreducible  parameters 
deduced  from  group  representation  theory.  The  solution  is  not  unique,  contain¬ 
ing  an  indeterminate  scale  factor  and  comprising  true  and  spurious  solutions. 
Their  geometrical  interpretations  are  also  studied.  The  spurious  solution  disap¬ 
pears  if  two  or  more  regions  of  the  object  are  observed. 


22. Ken-ichi Kanatani, “Analysis of Structure and Motion from Optical Flow.
Part II: Central Projection.” CAR-TR-161, CS-TR-1577, January
1985. Revised June 1985.

ABSTRACT: In this Part 2, the 3D structure and motion of an object is determined
from its optical flow in central projection. As in Part 1, the image domain
is  divided  into  planar  or  almost  planar  regions  by  checking  the  flow.  For  each 
region,  parameters  of  the  flow  are  determined.  In  our  flow-based  approach,  the 
3D  structure  and  motion  are  computed  from  the  irreducible  parameters,  which 
are  complex  numbers  in  general,  deduced  from  group  representation  theory.  The 
transition  from  central  projection  via  "pseudo-orthographic  projection"  to  ortho¬ 
graphic  projection  is  also  discussed.  The  solution  is  not  unique.  Besides  the 
absolute depth being indeterminate, there arise two solutions, the true one and a
spurious one. However, the spurious solution disappears if two regions of the
object  are  observed.  The  adjacency  condition  of  two  planar  regions  is  also  stu¬ 
died  in  terms  of  complex  variables.  The  relation  to  the  correspondence-based 
approach  is  shown,  too. 


23. Ken-ichi Kanatani, “Transformation of Optical Flow by Camera Rotation.”
CAR-TR-163, CS-TR-1580, November 1985.

ABSTRACT:  The  effect  of  camera  rotation  on  the  description  of  optical  flow  is 
analyzed.  The  transformation  law  of  the  parameters  is  explicitly  given  by  consid¬ 
ering  infinitesimal  generators  and  irreducible  reduction  of  the  induced  representa¬ 
tion  of  the  3D  rotation  group.  The  parameter  space  is  decomposed  into  invariant 
subspaces,  and  the  optical  flow  is  accordingly  decomposed  into  two  parts,  from 
which  an  invariant  basis  is  deduced.  A  procedure  is  presented  to  test  the 
equivalence  of  two  optical  flows  and  to  reconstruct  the  necessary  amount  of  cam¬ 
era  rotation.  The  relationship  with  the  analytical  expressions  for  3D  recovery  is 
also  discussed. 


24. Ken-ichi Kanatani, “Structure from Motion without Correspondence: General
Principle.” CAR-TR-164, CS-TR-1581, November 1985.

ABSTRACT: A general principle is given for detecting 3D structure and motion
from an image sequence without using point-to-point correspondence. The procedure
consists of two stages: (i) determination of the flow parameters, which
completely  characterize  the  motion  of  the  planar  part  of  the  object,  and  (ii)  com¬ 
putation  of  3D  recovery  from  these  flow  parameters.  The  first  stage  is  done  by 
measuring  features  of  the  image  sequence.  The  second  stage  is  analytically 
expressed  in  terms  of  invariants  with  respect  to  coordinate  changes.  Typical 
features  and  relations  to  stepwise  tracing  are  also  discussed. 


25. Ken-ichi Kanatani, “The Constraints on Images of Rectangular Polyhedra.”
CAR-TR-165, CS-TR-1582, November 1985.

ABSTRACT:  This  paper  discusses  how  polyhedron  interpretation  techniques  are 
simplified  if  the  objects  are  rectangular  trihedral  polyhedra.  This  restriction 
enables  one  to  compute  the  spatial  orientation  of  a  given  corner  and  its  motion 
from its image. The solution is expressed in terms of polar coordinates, Eulerian
angles and quaternions. Then, based on the fact that the transformations mapping
eight possible configurations of the rectangular corner to each other form a
group isomorphic to Z₂ × Z₂ × Z₂, the corner configurations, their transformations,
spatial orientations and states of face adjacency are expressed by triplets of
binary bits, and the conditions constraining relationships among them are
described in algebraic equations in terms of those triplets. Finally, the visibility
conditions are formulated, and an algorithm of shape interpretation and hidden
line  detection  from  “local”  information  is  presented.  Some  examples  are  given  to 
compare  our  scheme  with  existing  ones.  The  possible  non-uniqueness  of  the 
interpretation  and  the  effect  of  projective  distortion  are  also  discussed. 


26.  Muralidhara  Subbarao,  “Interpretation  of  Image  Motion  Fields:  A  Spatio- 
Temporal  Approach.”  CAR-TR-167,  CS-TR-1589,  December  1985. 

ABSTRACT:  In  this  paper  we  describe  a  new  formulation  of  the  image  motion 
interpretation  problem.  The  formulation  addresses  the  general  problem  of  recov¬ 
ering  the  3D  local  surface  structure,  motion  and  deformation  of  an  opaque  object. 
It  is  based  on  the  assumption  of  local  analyticity  of  3D  surface  structure,  motion, 
deformation  and,  consequently,  the  corresponding  2D  image  motion  in  the  space- 
time  domain.  In  this  approach  both  spatial  and  temporal  information  are  used  in 
a  uniform  way.  The  formulation  is  very  general  in  the  sense  that  as  long  as  the 
analyticity  assumption  is  valid,  the  space  and  time  dependence  of  surface  struc¬ 
ture  and  motion  can  be  related  to  the  image  motion  parameters.  We  illustrate 
our  approach  by  formulating  and  solving  the  image  motion  interpretation  prob¬ 
lem  for  some  simple  cases  including  non-rigid  and  non-uniform  motions.  However, 
it  can  be  easily  extended  to  deal  with  more  complicated  cases. 

For  rigid  and  uniform  motions  we  have  solved  the  problem  for  three  impor¬ 
tant  cases.  The  first  two  relate  to  the  case  where  the  image  motion  is  observed 
in  a  fixed  image  neighborhood  and  the  other  case  is  where  the  camera  tracks  a 
fixed  point  on  the  object  in  motion  and  the  tracking  motion  of  the  camera  is 
known.  In  all  these  three  cases  we  have  solved  for  the  local  orientation  and  rigid 
motion  of  the  surface  patch  around  the  line  of  sight  using  only  the  first-order 
spatial  and  temporal  derivatives  of  the  image  velocity  field.  In  comparison,  all  the 
existing  methods  based  on  image  motion  fields  use  up  to  second-order  spatial 
derivatives  of  the  image  velocity  field  which  are  relatively  sensitive  to  noise. 


27.  Eliahu  Wasserstrom,  “Subpixel  Registration.”  CAR-TR-173,  CS-TR-1601, 
January  1986. 

ABSTRACT:  A  method  of  subpixel  image  registration  is  proposed  that  employs 
a  model  of  the  correlation  surface  in  the  vicinity  of  the  registration  point. 
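
The abstract does not say which model of the correlation surface is used. As an assumption made purely for illustration, the sketch below fits a parabola through the correlation peak and its two neighbors along one axis, a widely used choice; the function name `subpixel_peak_offset` is hypothetical.

```python
def subpixel_peak_offset(c_left, c_peak, c_right):
    """Offset of the vertex of the parabola through three correlation samples
    taken at positions -1, 0 and +1, relative to the central sample."""
    curvature = c_left - 2.0 * c_peak + c_right
    if curvature >= 0.0:
        raise ValueError("central sample is not a strict local maximum")
    return 0.5 * (c_left - c_right) / curvature
```

If the correlation really is parabolic near its peak, for example c(x) = -(x - 0.3)^2, the three samples at -1, 0 and +1 recover the offset 0.3 up to rounding. Applying the same formula along each image axis gives a simple separable two-dimensional estimate.
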


28. Fang-Jyh Lin Liu, Roger Eastman and Larry S. Davis, “Experiments in
Stereo Matching using Multiresolution Local Support.” CAR-TR-183,
CS-TR-1617, January 1986.

ABSTRACT:  This  paper  describes  a  set  of  computational  algorithms  for  stereo 
matching  based  on  multiresolution  local  support.  The  algorithms  combine  the 
feature-point  based  coarse-to-fine  matching  of  Marr-Poggio-Grimson,  the  local 
support  measures  of  Pollard-Mayhew-Frisby  and  Prazdny,  and  other  disambigua¬ 
tion  techniques.  The  matching  primitives  are  zero-crossing  points.  Matching 
compatibility  is  based  on  the  sign  of  the  contrast  and  the  gradient  direction  at 
the  Laplacian  zero-crossing  point.  These  algorithms  use  coarse  resolution 
disparity  to  constrain  the  disparity  search  window  in  the  fine  resolution  matching 
process,  which  speeds  up  the  search  and  greatly  improves  the  accuracy  of  matching. 
The  consistency  measure  includes  Gaussian  local  support  or  disparity  gradient 
threshold  local  support  in  combination  with  symmetric  or  iterative  disambigua¬ 
tion  techniques.  This  paper  includes  experiments  performed  on  a  variety  of 
stereo  images  containing  synthetic,  laboratory  and  aerial  scenes. 
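
The coarse-to-fine constraint and the compatibility test described above can be sketched as follows. This is a minimal illustration, not the report's implementation. The factor-of-two scale between resolution levels, the window margin, the angular tolerance, and the disparity convention `x_left - x_right` are all assumptions, and every function name is hypothetical.

```python
def fine_search_window(coarse_disparity, scale=2, margin=1):
    """Disparity range to search at the fine level, centered on the
    coarse-level disparity scaled to fine-level pixels."""
    center = scale * coarse_disparity
    return center - margin, center + margin


def compatible(sign_l, dir_l, sign_r, dir_r, max_angle_deg=30.0):
    """Two zero-crossing points are compatible if their contrast signs
    agree and their gradient directions (degrees) are close."""
    if sign_l != sign_r:
        return False
    diff = abs(dir_l - dir_r) % 360.0
    diff = min(diff, 360.0 - diff)  # wrap-around angular difference
    return diff <= max_angle_deg


def candidates(x_left, coarse_disparity, right_xs, scale=2, margin=1):
    """Right-image zero-crossing columns on the same scan line whose
    disparity (x_left - x_right) falls inside the constrained window."""
    lo, hi = fine_search_window(coarse_disparity, scale, margin)
    return [x for x in right_xs if lo <= x_left - x <= hi]
```

Restricting the fine-level search to a few pixels around the scaled coarse disparity is what reduces both the search cost and the number of false candidates that the local support measure must later disambiguate.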


29.  Ken-ichi  Kanatani  and  Tsai-Chia  Chou,  “Shape  from  Texture:  General 
Principle.”  CAR-TR-184,  CS-TR-1618,  February  1986. 

ABSTRACT:  The  3D  shape  of  a  textured  surface  is  recovered  from  its  projected 
image  on  the  assumption  that  the  texture  is  homogeneously  distributed.  First, 
the  homogeneity  of  a  discrete  texture  consisting  of  dots  and  line  segments  is 
defined  in  terms  of  the  theory  of  distributions.  Next,  distortion  of  the  observed 
texture  density  due  to  perspective  projection  is  described  in  terms  of  the  first  fun¬ 
damental  form,  which  is  expressed  with  respect  to  the  image  coordinate  system. 
Based  on  this  result,  the  basic  equations  to  determine  the  surface  shape  are 
derived  for  both  planar  and  curved  surfaces,  and  numerical  schemes  are  proposed 
to  solve  them.  Necessary  data  are  obtained  in  the  form  of  summation  or  integra¬ 
tion  of  functions  over  the  texture  elements  on  the  image  plane.  Ambiguity  in  the 
interpretation  of  curved  surfaces  is  also  analyzed.  Finally,  numerical  examples 
for  synthetic  data  are  presented,  and  our  method  is  compared  with  other  existing 
methods.  It  is  shown  that  all  other  methods  can  be  explained  in  terms  of  our  for¬ 
mulation. 


30.  Allen  M.  Waxman,  Behrooz  Kamgar-Parsi  and  Muralidhara  Subbarao, 
“Closed-Form  Solutions  to  Image  Flow  Equations  for  3-D  Structure 
and  Motion.”  CAR-TR-190,  CS-TR-1633,  February  1986. 

ABSTRACT:  A  major  source  of  three-dimensional  (3-D)  information  about 
objects  in  the  world  is  available  to  the  observer  in  the  form  of  time-varying 
imagery.  Relative  motion  between  textured  objects  and  observer  generates  a 
time-varying  optic  array  at  the  image,  from  which  image  motion  of  contours, 
edge  fragments  and  feature  points  can  be  extracted.  These  dynamic  features 
serve  to  sample  the  underlying  “image  flow”  field.  New,  closed-form  solutions 
are  given  for  the  structure  and  motion  of  planar  and  curved  surface  patches  from 
monocular  image  flow  and  its  derivatives  through  second  order.  Both  planar  and 
curved  surface  solutions  require,  at  most,  the  solution  of  a  cubic  equation.  The 
analytic  solution  for  curved  surface  patches  combines  the  transformation  of 
Longuet-Higgins  and  Prazdny  (1980)  with  the  planar  surface  solution  of  Subbarao 
and  Waxman  (1985).  New  insights  regarding  uniqueness  of  solutions  also  emerge. 
Thus,  the  “structure-motion  coincidence”  of  Waxman  and  Ullman  (1983)  is  inter¬ 
preted  as  the  “duality  of  tangent  plane  solutions.”  The  multiplicity  of  transfor¬ 
mation  angles  (up  to  three)  is  related  to  the  sign  of  the  Gaussian  curvature  of  the 
surface  patch.  Ovoid  patches  (i.e.,  bowls)  are  shown  to  possess  a  unique 
transform  angle,  though  they  are  subject  to  the  local  structure-motion  coin¬ 
cidence.  Thus,  ovoid  patches  almost  always  yield  a  unique  3-D  interpretation.  In 
general,  ambiguous  solutions  can  be  resolved  by  requiring  continuity  of  the  solu¬ 
tion  over  time. 


31.  Jacqueline  Le  Moigne  and  Allen  M.  Waxman,  “Structured  Light  Patterns  for 
Robot  Mobility.”  CAR-TR-191,  CS-TR-1634,  February  1986. 

ABSTRACT:  In  order  to  assess  the  feasibility  of  using  a  structured-light  range 
sensor  for  mobile  outdoor  and  indoor  robots,  we  discuss  a  number  of  operational 
considerations  and  image  processing  tools  relevant  to  this  task  domain.  In  partic¬ 
ular,  we  address  the  issues  of  operating  in  ambient  lighting,  smoothing  of  range 
texture,  grid  pattern  selection,  albedo  normalization,  grid  extraction  and  coarse 
registration  of  image  to  projected  grid.  Once  a  range  map  of  the  immediate 
environment  is  obtained,  short  range  path  planning  can  be  attempted. 


32.  Uma  Kant  Sharma  and  Larry  S.  Davis,  “Road  Following  by  an  Autonomous 
Vehicle  Using  Range  Data.”  CAR-TR-194,  CS-TR-1639,  March  1986. 

ABSTRACT:  This  paper  describes  a  road  following  system  for  an  Autonomous 
Land  Vehicle.  Range  data  is  used  as  the  sensor  input.  The  system  is  divided 
into  two  parts:  low-level  data  driven  analysis  followed  by  high-level  model- 
directed  search.  The  sequence  of  steps  performed,  in  order  to  detect  3-D  road 
boundaries,  is  as  follows:  Range  data  is  first  converted  from  spherical  into  Carte¬ 
sian  coordinates.  A  quadric  (or  planar)  surface  is  then  fitted  to  the  neighborhood 
of  each  range  pixel,  using  a  least-squares  fit  method.  Based  on  this  fit,  minimum 
and  maximum  principal  surface  curvatures  are  computed  at  each  point  to  detect 
edges.  Next,  using  Hough  transform  techniques,  3-D  local  line  segments  are 
extracted.  Finally,  model  directed  reasoning  is  applied  to  detect  the  road  boun¬ 
daries. 
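
The first step of the sequence above, converting a range pixel from spherical to Cartesian coordinates, can be sketched as follows. The axis convention (z forward along the sensor axis, x to the right, y up) and the angle definitions are assumptions made for this illustration; the abstract does not specify the sensor geometry. The function name is hypothetical.

```python
import math


def spherical_to_cartesian(rng, azimuth, elevation):
    """Convert one range sample to Cartesian coordinates.

    Assumed convention (not taken from the report):
      rng       -- measured distance along the ray
      azimuth   -- horizontal angle from the forward (z) axis, radians
      elevation -- vertical angle above the horizontal plane, radians
    Returns (x, y, z) with x right, y up, z forward.
    """
    horizontal = rng * math.cos(elevation)  # projection onto the x-z plane
    x = horizontal * math.sin(azimuth)
    y = rng * math.sin(elevation)
    z = horizontal * math.cos(azimuth)
    return x, y, z
```

Applying this to every pixel of the range image gives the 3-D points to which the local quadric or planar surfaces are then fitted.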


33.  Behrooz  Kamgar-Parsi,  “Practical  Computation  of  Pan  and  Tilt  Angles.” 
CAR-TR-195,  CS-TR-1640,  March  1986. 

ABSTRACT:  The  sensitivity  of  the  3-D  recovery  from  a  stereo  pair  of  images  of 
object  points  in  space  to  the  errors  in  the  pan  and  tilt  angles  is  studied.  It  is 
shown  that  precise  knowledge  of  the  relative  pan  angle  of  the  two  cameras  with 
respect  to  each  other  (and  to  a  lesser  degree  the  relative  tilt  angle)  is  crucial  to 
the  accurate  3-D  recovery  of  object  points  in  space,  whereas  accurate  knowledge 
of  the  pan  and  tilt  angles  relative  to  the  scene,  i.e.  the  “world”  angles,  is  of  less 
significance.  This  indicates  that  the  most  important  task  would  be  the  computa¬ 
tion  of  the  relative  pan  and  tilt  angles.  Theoretically,  it  is  well  known  that  using 
corresponding  left  and  right  image  positions,  one  can  calculate  the  orientation  of 
the  two  cameras.  Limited  resolution  capabilities  of  existing  cameras,  however, 
make  this  task  a  difficult  one.  Practical  difficulties  in  computation  of  camera 
orientations  are  discussed.  It  is  shown  that  while  computation  of  the  relative  tilt 
angle  is  fairly  easy,  computation  of  the  relative  pan  and  in  particular  the  world 
pan  angles  is  difficult.  Indeed  for  many  image  pairs  it  may  not  be  possible  to 
compute  the  world  pan  angle  with  any  degree  of  reliability.  However,  it  is  shown 
that  often  it  is  possible  to  bypass  the  computation  of  the  world  pan  angle  and  to 
compute  the  relative  pan  and  tilt  angles  directly.  This  is  despite  the  fact  that  in 
the  analytic  formulation  of  the  problem  the  three  angles  are  coupled.  The  stereo 
model  studied  here  is  assumed  to  have  a  fixed  baseline  and  small  relative  pan  and 
tilt  angles.  A  possible  application  of  such  a  stereo  model  is  the  visual  system  of 
an  autonomous  vehicle  whose  task  is  road  following. 
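
A small numerical sketch can show why the relative pan angle matters more than the world pan angle. It uses a simplified pinhole model with nominally parallel optical axes and a point straight ahead of the left camera. This model, the parameter values, and the function names are assumptions made for illustration; they are not the report's formulation.

```python
import math


def depth_error_from_relative_pan(f, b, Z, delta):
    """Depth error when the right camera is panned by delta (radians)
    relative to the left one but is assumed to be parallel.

    f -- focal length in pixels, b -- baseline, Z -- true depth
    (b and Z in the same units). The point lies on the left optical axis.
    """
    # Disparity actually measured with the unmodelled pan offset.
    d_measured = f * math.tan(math.atan2(b, Z) + delta)
    # Depth recovered under the (wrong) parallel-axis assumption.
    return f * b / d_measured - Z


def lateral_shift_from_world_pan(Z, delta):
    """A world pan error rotates the whole reconstruction rigidly about
    the camera, so a point at depth Z moves sideways by about Z * delta."""
    return Z * math.sin(delta)


if __name__ == "__main__":
    f, b, Z, delta = 1000.0, 0.5, 20.0, 0.001  # 1 mrad error, hypothetical rig
    print(depth_error_from_relative_pan(f, b, Z, delta))
    print(lateral_shift_from_world_pan(Z, delta))
```

To first order the depth error is about -Z²·delta/b, so it grows with the square of the distance, whereas the same error in the world pan angle only displaces the point by about Z·delta. With the hypothetical values above, that is a depth error of several tenths of a meter against a lateral shift of about two centimeters.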


34.  Muralidhara  Subbarao,  “Interpretation  of  Image  Motion  Fields:  Rigid 
Curved  Surfaces  in  Motion.”  CAR-TR-199,  CS-TR-1654,  April  1986. 

ABSTRACT:  This  paper  is  concerned  with  recovering  the  three-dimensional 
shape  and  motion  of  a  rigid  surface  from  its  image  motion  field  on  a  camera’s 
image  plane.  A  partial  solution  to  this  problem  was  proposed  by  Longuet-Higgins 
and  Prazdny  [1],  and  recently  a  more  complete  solution  has  been  obtained  by 
Waxman,  Kamgar-Parsi  and  Subbarao  [2].  Here  we  reconsider  this  problem  in  the 
context  of  our  recent  work  [3]  where  a  general  formulation  and  solution  procedure 
is  proposed  for  the  interpretation  of  image  motion  fields.  Using  this  new 
approach,  closed-form  solutions  are  derived  for  the  motion,  orientation  and 
curvatures  of  a  rigid  surface.  In  comparison  with  the  previous  approaches  [1,2]  this 
approach  does  not  involve  rotating  the  image  coordinate  system  in  order  to  solve 
for  the  unknowns.  The  solution  is  obtained  directly  in  the  original  coordinate  sys¬ 
tem,  thus  saving  some  computation.  More  importantly  we  state  and  prove  some 
fundamental  theoretical  results  concerning  the  existence  of  multiple  interpreta¬ 
tions  for  an  instantaneous  image  motion  field  resulting  from  a  rigid  body  motion. 
Conditions  for  the  occurrence  of  up  to  four  (four  being  the  maximum  possible) 
solutions  are  stated  and  proved.  Numerical  examples  are  given  for  some  interest¬ 
ing  cases  where  multiple  solutions  exist.  The  results  are  presented  in  a  sequential 
order  which  suggests  a  straightforward  implementation  of  the  solution  method. 
This  work,  along  with  our  previous  work  [3,12],  suggests  a  unified  computational 
approach  for  the  interpretation  of  image  motion  fields  in  a  variety  of  situations 
(e.g.:  planar/curved  surfaces  using  spatial  and/or  temporal  image  flow  parame¬ 
ters,  rigid/non-rigid  motion,  etc.). 


35.  Ken-ichi  Kanatani,  “Constraints  on  Length  and  Angle.”  CAR-TR-200, 
CS-TR-1655,  April  1986. 

ABSTRACT:  Given  a  perspective  projection  of  line  segments  on  the  image 
plane,  the  constraints  on  their  3D  positions  and  orientations  are  derived  when 
their  true  length  or  the  true  angles  they  make  are  known.  The  line  segments 
under  consideration  are  first  mapped  to  the  center  of  the  image  plane  as  if  the 
camera  were  rotated  to  aim  at  them.  Then,  the  constraints  are  given  by  the 
geometry  of  perspective  transformation,  and  the  relations  obtained  are 
transformed  back  to  the  original  configuration  in  the  scene.  An  application  is 
given  to  the  interpretation  of  rectangular  corners  of  polyhedra. 


36.  Ken-ichi  Kanatani,  “Camera  Rotation  Invariance  of  Image  Characteristics.” 
CAR-TR-202,  CS-TR-1663,  May  1986. 

ABSTRACT:  The  image  transformation  due  to  camera  rotation  relative  to  a  sta¬ 
tionary  scene  is  analyzed,  and  the  associated  transformation  rules  of  “features” 
given  by  weighted  averaging  of  the  image  are  derived  by  considering  infinitesimal 
generators  on  the  basis  of  group  representation  theory.  Three  dimensional  vec¬ 
tors  and  tensors  are  reduced  to  two  dimensional  invariants  on  the  image  plane 
from  the  viewpoint  of  projective  geometry.  Three  dimensional  invariants  and 
camera  rotation  reconstruction  are  also  discussed.  The  result  is  applied  to  the 
shape  recognition  problem  when  camera  rotation  is  involved. 